Multiword Expressions in NLP: General Survey and a Special Case of Verb-Noun Constructions

نویسنده

  • Olga Kolesnikova
چکیده

This chapter presents a survey of contemporary NLP research on Multiword Expressions (MWEs). MWEs pose a huge problem to precise language processing due to their idiosyncratic nature and diversity of their semantic, lexical, and syntactical properties. The chapter begins by considering MWEs definitions, describes some MWEs classes, indicates problems MWEs generate in language applications and their possible solutions, presents methods of MWE encoding in dictionaries and their automatic detection in corpora. The chapter goes into more detail on a particular MWE class called Verb-Noun Constructions (VNCs). Due to their frequency in corpus and unique characteristics, VNCs present a research problem in their own right. Having outlined several approaches to VNC representation in lexicons, the chapter explains the formalism of Lexical Function as a possible VNC representation. Such representation may serve as a tool for VNCs automatic detection in a corpus. The latter is illustrated on Spanish material applying some supervised learning methods commonly used for NLP tasks.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Domain-Dependent Identification of Multiword Expressions

The identification of different kinds of multiword expressions require different solutions, on the other hand, there might be domain-related differences in their frequency and typology. In this paper, we show how our methods developed for identifying noun compounds and light verb constructions can be adapted to different domains and different types of texts. Our results indicate that with littl...

متن کامل

MWEs in Treebanks: From Survey to Guidelines

By means of an online survey, we have investigated ways in which various types of multiword expressions are annotated in existing treebanks. The results indicate that there is considerable variation in treatments across treebanks and thereby also, to some extent, across languages and across theoretical frameworks. The comparison is focused on the annotation of light verb constructions and verba...

متن کامل

How to Account for Idiomatic German Support Verb Constructions in Statistical Machine Translation

Support-verb constructions (i.e., multiword expressions combining a semantically light verb with a predicative noun) are problematic for standard statistical machine translation systems, because SMT systems cannot distinguish between literal and idiomatic uses of the verb. We work on the German to English translation direction, for which the identification of support-verb constructions is chall...

متن کامل

Using Noun Similarity to Adapt an Acceptability Measure for Persian Light Verb Constructions

Light verb constructions (LVCs), such as take a walk and make a decision, are a common subclass of multiword expressions (MWEs), whose distinct syntactic and semantic properties call for a special treatment within a computational system. In particular, LVCs are formed semi-productively: often a semantically-general verb (such as take) combines with a number of semantically-similar nouns to form...

متن کامل

Using Distributional Similarity of Multi-way Translations to Predict Multiword Expression Compositionality

We predict the compositionality of multiword expressions using distributional similarity between each component word and the overall expression, based on translations into multiple languages. We evaluate the method over English noun compounds, English verb particle constructions and German noun compounds. We show that the estimation of compositionality is improved when using translations into m...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016